Assembly Language
© Copyright Brian Brown, 1988-2000. All rights reserved.
ASSEMBLY LANGUAGE PROGRAMMING, Part 5
MODERN 16 BIT MICROPROCESSORS
In the code examples so far, we have separated the coded instructions from the data. Modern processors like the 8088 have separate registers which deal with each section of a program.
CS and IP  = instructions
DS, BX, SI = data
ES, BX, DI = extra data
SS, SP, BP = stack
In writing programs for modern processors like the 8088, the program is structured with a minimum of three sections, called SEGMENTS. The three segments represent the CODE, DATA and STACK areas of the program. Information within each segment is accessed differently depending upon the segment type. Accessing data in the stack segment requires the use of the SS register together with SP or BP. The following diagrams illustrate how information in the stack and data segments is accessed.
Special assembler directives are used to specify the different segments.
SEGMENT DIRECTIVES
The following directives illustrate how to
define the three basic segments for an 8088 assembly language program.
    .STACK 100H
    .DATA
    .CODE
The value following the stack directive specifies the size of the stack segment.
The programmer is responsible for initializing the segment registers DS and ES to the correct segments of the program. Failure to do so will result in a program which will not access the data and extra data segments properly. The operating system will only initialize the CS, SS, SP and IP registers.
The following code portion illustrates how to set up the data segment register. This is performed at the beginning of the code segment.
    .STACK 100H
    .DATA
    .CODE
    MOV AX, @DATA ; initialize DS
    MOV DS, AX
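Once DS has been initialized, the two access paths described earlier can be tried out. The following is a minimal sketch in MASM/TASM syntax, not part of the original notes: the variable name value and the use of .MODEL SMALL are assumptions made for illustration. It reads a data item through DS:BX and a stack item through SS:BP.

```asm
        .MODEL SMALL            ; assumed memory model for this sketch
        .STACK 100H
        .DATA
value   DW   1234H              ; hypothetical data item

        .CODE
start:  mov  ax, @DATA          ; the program must set up DS itself
        mov  ds, ax

        mov  bx, OFFSET value   ; BX holds an offset within the data segment
        mov  ax, [bx]           ; BX defaults to DS, so this reads DS:[BX]

        push ax                 ; PUSH stores AX on the stack at SS:[SP]
        mov  bp, sp             ; BP now addresses the pushed word
        mov  dx, [bp]           ; BP defaults to SS, so this reads SS:[BP]
        pop  ax                 ; remove the word from the stack again

        mov  ax, 4C00H          ; return to PCDOS
        int  21H
        END  start
```

The point of the sketch is the default segment pairing: an address formed with BX (or SI, DI) is taken from the data segment, whereas an address formed with BP is taken from the stack segment.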
DIFFERENT SIZED MEMORY MODELS
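The assembler supports several
different memory models for the 8088 processor. We shall look at the most common types.
.MODEL SMALL
One code segment and one data segment, each limited to 64K bytes.
.MODEL LARGE
Multiple code segments and multiple data segments; code and data are reached with far (segment:offset) addresses.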
Use this memory model for all your programs.
SUPPORT FOR DIFFERENT CPU TYPES
The following directives are used
to specify the processor type.
    .186
    .286
    .386
    .8087
    .8086
RETURNING TO PCDOS
When an assembly language program running under
PCDOS terminates, it must return to the operating system so that the user shell
program can be re-loaded. The correct format is to use the following code
sequence
    mov ax, 4c00h
    int 21h
ASSEMBLER DIRECTIVES FOR IBM-PC PROGRAMS
The following is a
discussion of the assembler directives applicable to packages like Microsoft
Masm and Turbo Assembler. These packages are used to write machine code programs
which run under PCDOS.
name EQU expression
EQU assigns the expression to the name. The name may be an absolute symbol, which represents a 16-bit value, or an alias, which is a name that represents another symbol or piece of text. The declared name must be unique, one that has not been previously declared.
    pi      EQU 3.14159
    clearax EQU xor ax,ax
The first example directs the assembler to replace every occurrence of the name pi with the value 3.14159, whilst the second example instructs the assembler to replace every occurrence of clearax with the instruction xor ax,ax.
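As a brief illustration of how such names are then used, here is a hedged sketch in MASM/TASM syntax. The name maxlen and the surrounding program are assumptions added for illustration; clearax is the alias from the example above.

```asm
        .MODEL SMALL
        .STACK 100H

maxlen  EQU  80                 ; absolute symbol (hypothetical name)
clearax EQU  xor ax,ax          ; alias for an instruction, as defined above

        .CODE
start:  clearax                 ; assembled as: xor ax,ax
        mov  cx, maxlen         ; assembled as: mov cx,80

        mov  ax, 4C00H          ; return to PCDOS
        int  21H
        END  start
```

Neither EQU line allocates any storage; both names exist only while the program is being assembled.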
name DB initialvalue,,,
The DB directive allocates one or more bytes of storage and optionally initializes them. The name portion is optional.
    value1 DB 16
    form   DB 6*2
    text   DB "Enter your name:"
In the first example, value1 is assigned a byte, and is initialized to 16. The second example sets form equal to 12 and assigns it a byte. In the last example, text is defined as a sequence of bytes which each contain a character from the specified string. The first byte will be initialized to 'E', whilst the last byte will be initialized to the ':' character.
name DW initialvalue,,,
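The DW directive allocates one or more words (two bytes each) of storage and optionally initializes them. The name portion is optional.
         DW ?
    mess DW 'ab'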
The first example allocates one word of storage, but does not define its initial value (?). The second example defines mess as a word initialized with the character string 'ab'.
Strings used with the DW directive must not contain more than two characters. The 'b' will be placed in the low-order byte, and the 'a' will be placed in the high-order byte. If only one character is specified, the high-order byte will contain 00H. The low-order byte appears FIRST in memory for Intel processors.
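The byte ordering described above can be checked by reading the word back one byte at a time. This is a small sketch in MASM/TASM syntax, assuming the assembler packs 'ab' as the text describes; the variable num and the surrounding program are additions for illustration.

```asm
        .MODEL SMALL
        .STACK 100H
        .DATA
mess    DW   'ab'               ; word value 6162H
num     DW   1234H              ; hypothetical second word

        .CODE
start:  mov  ax, @DATA
        mov  ds, ax

        mov  ax, mess               ; AX = 6162H: 'a' high, 'b' low
        mov  bl, BYTE PTR mess      ; first byte in memory:  62H ('b')
        mov  bh, BYTE PTR mess + 1  ; second byte in memory: 61H ('a')

        mov  cl, BYTE PTR num       ; first byte in memory:  34H
        mov  ch, BYTE PTR num + 1   ; second byte in memory: 12H

        mov  ax, 4C00H          ; return to PCDOS
        int  21H
        END  start
```

BYTE PTR overrides the word size of the variable so that a single byte can be loaded from it.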
TITLE Graphics
This appears at the top of each page in the assembler list file, after the source file name.
NAME Calculate_Gross
This assigns a name to the module being assembled.
When assembly is taking place, and the page directive is encountered, the assembler generates a form-feed character to set a new page, and continues the assembly on the new page. In this way, the programmer can organize a printout of modules on a per page basis, so that the printout of more than one module per page does not occur.
    PAGE 66,132 ; 66 lines per page, 132 characters wide
    PAGE        ; go to new page in list file
    name PROC codetype
         ....
         ret
    name ENDP
The last instruction in a procedure is a RETurn instruction. The codetype is FAR for large memory models, NEAR for small memory models. A procedure must be entered using the appropriate CALL instruction.
The DQ directive defines a quad word [8 bytes] of storage for double precision floating point numbers.
The DT directive defines 10 bytes of storage. This is normally used for packed BCD numbers and for a 10 byte temporary real floating point value, as this storage format is also used by the 80x87 arithmetic co-processor.
            .MODEL SMALL
            .STACK 100H
            .DATA
    temp    db 10
    mess    db 'Hi there','$'
            .CODE
    start:  mov ax, @data
            mov ds, ax
            mov ah, 9h
            mov dx, OFFSET mess ; mess is at offset 1 in the .DATA segment
            int 21h             ; print message
            mov ax, 4c00h       ; return to PCDOS
            int 21h
            END start
SAMPLE PROGRAM FOR IBM-PC
    TITLE Doscall
    ;Doscall.asm source file
            .MODEL SMALL
    CR      equ 0dh             ; carriage return
    LF      equ 0ah             ; line feed
    EOSTR   equ '$'
            .stack 200h
            .data
    message db 'Hello and welcome.'
            db CR, LF, EOSTR
            .code
    print   proc near
            mov ah,9h           ;PCDOS print function
            int 21h
            ret
    print   endp
    start:  mov ax, @data
            mov ds, ax
            mov dx, offset message
            call print
            mov ax, 4c00h
            int 21h
            end start
The program is assembled by typing
    $ TASM DOSCALL
    Turbo Assembler V1.0 Copyright(c)1988 by Borland International
    Assembling file: DOSCALL.ASM
    Error messages: None
    Warning messages: None
    Remaining memory: 257k
    $
This produces an object file named DOSCALL.OBJ which must be linked to create an executable file which can run under PCDOS.
    $ TLINK DOSCALL
    Turbo Link V2.0 Copyright (c) 1987, 1988 Borland International
    $
The program when run, produces the following output.
    $ DOSCALL
    Hello and welcome.
    $
MACROS
The macro directive allows the programmer to write a named
block of source statements, then use that name in the source file to represent
the group of statements. During the assembly phase, the assembler automatically
replaces each occurrence of the macro name with the statements in the macro
definition.
Macros are expanded on every occurrence of the macro name, so they can increase the length of the executable file if used repeatedly. Procedures or subroutines take up less space, but the increased overhead of saving and restoring addresses and parameters can make them slower. In summary, the advantages and disadvantages of macros are,
Advantages
Disadvantages
In large programs, produce greater code size than
procedures
When to use Macros
MACRO DEFINITION
Defining Macros is done as follows,
    name MACRO [optional arguments]
         statements
         statements
         ENDM
Consider the following macro to return to PCDOS from an assembly language program.
    exittodos MACRO
              mov ax,4C00h
              int 21h
              ENDM
Macros are expanded when the program is assembled. This means that every occurrence of the macro name (apart from the definition) is replaced by the statements in the macro definition. An example will demonstrate this.
    TITLE dosmacro
            .MODEL small
    exittodos MACRO
            mov ax,4C00h
            int 21h
            ENDM
            .STACK 100h
            .DATA
    message DB 'Hello and Welcome', '$'
            .CODE
    start:  mov ax, @data
            mov ds, ax
            mov ah, 9h
            mov dx, OFFSET message
            int 21h
            exittodos
            END start
When assembled, the macro is replaced and the internal representation of the file looks like,
    TITLE dosmacro
            .MODEL small
    exittodos MACRO
            mov ax,4C00h
            int 21h
            ENDM
            .STACK 100h
            .DATA
    message DB 'Hello and Welcome', '$'
            .CODE
    start:  mov ax, @data
            mov ds, ax
            mov ah, 9h
            mov dx, OFFSET message
            int 21h
            mov ax,4C00h
            int 21h
            END start
Macros can also accept values (parameters).
    addup MACRO ad1, ad2, ad3
          mov ax, ad1
          mov dx, ad2
          mov cx, ad3
          ENDM
In this example a macro named addup is created. It accepts three parameters, ad1, ad2 and ad3. The code which follows, consisting of the mov statements, will be used to replace every occurrence of the macro name addup in the source file. The macro is terminated with the ENDM statement. Calling a macro with arguments is done as follows,
addup bx, 2, count
This has the effect of loading the ax register with the contents of the bx register, the dx register with the value 2, and the cx register with the value of count.
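To make the substitution concrete, the sketch below places the addup macro in a complete program and shows, as comments, the statements the assembler generates for the call. The variable count, its value, and the value loaded into bx are assumptions for illustration only.

```asm
        .MODEL SMALL
        .STACK 100H

addup   MACRO ad1, ad2, ad3
        mov  ax, ad1
        mov  dx, ad2
        mov  cx, ad3
        ENDM

        .DATA
count   DW   5                  ; hypothetical variable

        .CODE
start:  mov  ax, @DATA
        mov  ds, ax
        mov  bx, 10             ; hypothetical value for bx

        addup bx, 2, count      ; the assembler generates:
                                ;     mov ax, bx
                                ;     mov dx, 2
                                ;     mov cx, count

        mov  ax, 4C00H          ; return to PCDOS
        int  21H
        END  start
```

Each actual argument is substituted as text, which is why a register, a constant and a memory variable can all be passed to the same macro.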
Macro definitions may include other macro names, and macros may also be recursive: they can call themselves, eg,
    pushall MACRO reg1, reg2, reg3, reg4, reg5, reg6
            IFNB <reg1>         ;; If parameter not blank
            push reg1           ;; push one register and
                                ;; repeat
            pushall reg2, reg3, reg4, reg5, reg6
            ENDIF
            ENDM

            pushall ax, bx, si, ds
            pushall cs, es
This shows a recursive macro called pushall that continues to call itself until it encounters a blank argument. In effect, it pushes the registers specified in the macro call onto the stack.
The ;; indicates that the comment field of the macro should not be expanded with the macro statements.
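As a sketch of what the first pushall call above assembles to, the program below uses the same macro and then restores the registers by hand. The pop sequence is an addition for illustration; it simply reverses the order of the generated push instructions.

```asm
        .MODEL SMALL
        .STACK 100H

pushall MACRO reg1, reg2, reg3, reg4, reg5, reg6
        IFNB <reg1>             ;; stop when the argument is blank
        push reg1
        pushall reg2, reg3, reg4, reg5, reg6
        ENDIF
        ENDM

        .CODE
start:  pushall ax, bx, si, ds  ; the assembler generates:
                                ;     push ax
                                ;     push bx
                                ;     push si
                                ;     push ds

        ; ... code that is free to change AX, BX, SI and DS ...

        pop  ds                 ; restore in the reverse order
        pop  si
        pop  bx
        pop  ax

        mov  ax, 4C00H          ; return to PCDOS
        int  21H
        END  start
```

Every recursive call drops the first argument, so the fifth call receives only blank arguments and generates nothing.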
IMPLEMENTING FP NUMBERS, ARRAYS, RECORDS AND JUMP TABLES
Floating Point Numbers
The following example shows the declaration
of a single precision floating point number, written as a decimal constant and
stored in IEEE 754 format.
FPnum1 DD 1.32740
BCD strings
The following example declares a packed BCD constant.
BCDval DT 123456
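Ten bytes are allocated. In the packed BCD format used by the 80x87, nine bytes hold 18 decimal digits (two per byte) and the most significant byte holds the sign, giving a number range of -999,999,999,999,999,999 to +999,999,999,999,999,999.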
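The storage layout of the constant can be inspected by reading its first byte and separating the two digits. This is a hedged sketch in MASM/TASM syntax, assuming the assembler stores a plain decimal integer given to DT as packed BCD with the least significant digits first; the surrounding program is an addition for illustration.

```asm
        .MODEL SMALL
        .STACK 100H
        .DATA
BCDval  DT   123456             ; ten bytes, two decimal digits per byte
                                ; offset 0: 56H, offset 1: 34H, offset 2: 12H
                                ; offsets 3 to 9: 00H for this value

        .CODE
start:  mov  ax, @DATA
        mov  ds, ax

        mov  al, BYTE PTR BCDval ; AL = 56H (digits 5 and 6)
        mov  ah, al
        and  al, 0FH            ; AL = 06H, the units digit
        mov  cl, 4
        shr  ah, cl             ; AH = 05H, the tens digit

        mov  ax, 4C00H          ; return to PCDOS
        int  21H
        END  start
```

The shift count is placed in CL because the 8086 and 8088 only allow a shift by 1 or by the contents of CL.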
HANDLING ARRAYS
Arrays and array elements are dealt with using
pointers. This involves either based or indexed addressing.
1: Load a base/index register with the address of the first element
2: Calculate the offset position of the required element (1 byte for characters, 2 bytes for integers etc)
3: Perform the operation by either
   a) incrementing the base/index register by the required amount
   b) using based indexed addressing
eg, X := IntArray[4];
    mov bx, offset IntArray ; base address
    mov ax, 4               ; element number
    shl ax, 1               ; calculate offset (2 bytes per integer)
    mov si, ax
    mov ax, [bx + si]       ; fetch the element into a register first,
    mov X, ax               ; as memory to memory moves are not allowed
    FOR Loop := 1 to 10 do
    BEGIN
        sum := sum + IntArray[Loop]
    END;

    ; LoopVar holds the Pascal variable Loop, because LOOP is an instruction mnemonic
    initfor: mov ax, 1               ; Loop := 1
             mov LoopVar, ax
             mov bx, offset IntArray ; set up base register
    fortest: mov ax, LoopVar
             cmp ax, 10
             ja  forexit             ; finished when Loop > 10
             shl ax, 1               ; calculate offset (2 bytes per integer)
             mov si, ax
             mov ax, [bx + si]
             mov cx, sum             ; add sum and IntArray[Loop]
             add ax, cx
             mov sum, ax             ; update sum
             inc LoopVar             ; Loop := Loop + 1
             jmp fortest
    forexit:
Integer Arrays
Integer arrays occupy two bytes per element. A
typical operation is to sum the contents of an integer array. The following code
for an 8086 shows this.
    TITLE IntArray
            .MODEL Large
            .STACK 200h
            .DATA
    mess    db 'The total is ','$'
    result  dw ?
    IntArry dw 10, 34, 76, 25, 14, 9, 3, 22
    IntAlen dw ($ - IntArry) / 2
    buff    db 6 dup( 20h )
            db '$'
            .CODE
    binasc  proc far                ; convert result to ascii string
            mov ax, [result]        ; get number to convert
            push ax                 ; save it
            mov si, offset buff[5]  ; point to end of string area
            mov cx, 10              ; divide base factor
            or  ax, ax              ; negative number?
            jns do1
            neg ax                  ; then convert its magnitude
    do1:    cmp ax, 10              ; compare with base factor
            jb  exit1
            mov dx, 0               ; clear upper numerator
            div cx                  ; divide by base factor
            add dl, 30h             ; convert remainder to ASCII
            mov [si], dl            ; and store it
            dec si                  ; next character
            jmp do1
    exit1:  add al, 30h             ; convert last character
            mov [si], al            ; and store it
            pop ax                  ; recover
            or  ax, ax              ; and test for sign bit
            jns exit2
            dec si                  ; store '-' sign
            mov bl, 2dh
            mov [si], bl
    exit2:  ret
    binasc  endp

    start:  mov ax, @data
            mov ds, ax
            mov [result], 0000h     ; clear result
            mov cx, IntAlen         ; count of elements
            mov bx, offset IntArry  ; point to IntArry
            mov si, 0000h           ; first element
            xor ax, ax              ; clear total
    lp1:    add ax, [bx + si]       ; add value to total
            inc si                  ; next element
            inc si
            dec cx
            jne lp1
            mov [result], ax        ; store total
            mov dx, offset mess     ; print message
            mov ah, 9h
            int 21h
            call binasc             ; convert result to ASCII
            mov dx, offset buff
            mov ah, 9h
            int 21h
            mov ax, 4c00h           ; exit to DOS
            int 21h
            END start
Other typical operations involve the determination of the minimum and maximum values.
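As one possible sketch of such an operation, the program below scans an integer array for its maximum value. It is not from the original notes: the variable maxval, the labels and the use of .MODEL SMALL are assumptions, and the comparison treats the elements as signed integers.

```asm
        .MODEL SMALL
        .STACK 100H
        .DATA
IntArry DW   10, 34, 76, 25, 14, 9, 3, 22
IntAlen DW   ($ - IntArry) / 2  ; number of elements
maxval  DW   ?                  ; hypothetical result variable

        .CODE
start:  mov  ax, @DATA
        mov  ds, ax

        mov  bx, OFFSET IntArry ; point to first element
        mov  cx, IntAlen        ; count of elements
        mov  ax, [bx]           ; assume first element is the maximum
        dec  cx                 ; one element already examined
        jz   maxdone            ; single element array

nextel: add  bx, 2              ; advance to next element
        cmp  ax, [bx]
        jge  keepmx             ; signed compare, AX is already larger
        mov  ax, [bx]           ; otherwise remember the new maximum
keepmx: dec  cx
        jnz  nextel

maxdone:
        mov  maxval, ax         ; 76 for the data above

        mov  ax, 4C00H          ; return to PCDOS
        int  21H
        END  start
```

Finding the minimum only requires replacing jge with jle. For unsigned data the equivalent jumps are jae and jbe.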
Records (Structures)
Records in Pascal support the use of different
sized field items. Consider the storage of the following record.
    Var example_record : RECORD
            int_number : integer;
            fp_number  : real;
            letter     : char;
        END;
The same record is implemented in assembly language by first defining its composition.
    ex_rec  STRUC
    int_num dw ?
    fp_num  dd ?
    lett    db "          "     ; 10 bytes reserved for a string
    ex_rec  ENDS
The next step creates a record which has the composition of the previous record's definition.
my_rec ex_rec <22, 3.2, 'Hi there.$'>
Each field of the record is accessed in a similar way to Pascal, eg,
    my_rec.lett
accesses the lett field of the record my_rec. The following program shows an implementation for the 8088 processor.
    TITLE Records
            .MODEL Large
    ex_rec  STRUC
    int_num dw ?
    fp_num  dd ?
    mess    db "             "  ; 13 bytes, room for the string below
    ex_rec  ENDS
            .STACK 200h
            .DATA
    myrec   ex_rec <22,1.30, "Hello there.$">
            .CODE
    start:  mov ax, @data
            mov ds, ax
            mov dx, offset myrec.mess
            mov ah, 9h
            int 21h
            mov ax,4c00h
            int 21h
            END start
Jump Tables
Jump tables are an efficient method of implementing
switch/case type statements. A jump table consists of an array of addresses.
Using an offset into the array selects the address of the routine which handles
that particular value.
Jump tables are efficient, because it always takes the same time to select any routine from the table. The order may be re-arranged or new routines added simply by increasing the size of the table.
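Before the full program, the mechanism can be seen in a much smaller sketch. All of the names here (jmptbl, choice, the messages and labels) are assumptions for illustration, and the range check on the selector is an added precaution rather than part of the technique itself.

```asm
        .MODEL SMALL
        .STACK 100H
        .DATA
msg0    DB   'zero$'
msg1    DB   'one$'
msg2    DB   'two$'
jmptbl  DW   case0, case1, case2 ; one near address per case
choice  DW   1                   ; hypothetical selector, 0 to 2

        .CODE
start:  mov  ax, @DATA
        mov  ds, ax

        mov  bx, choice
        cmp  bx, 2              ; reject selectors outside the table
        ja   finish
        shl  bx, 1              ; two bytes per table entry
        jmp  jmptbl[bx]         ; indirect jump through the table

case0:  mov  dx, OFFSET msg0
        jmp  show
case1:  mov  dx, OFFSET msg1
        jmp  show
case2:  mov  dx, OFFSET msg2
show:   mov  ah, 9H             ; PCDOS print function
        int  21H

finish: mov  ax, 4C00H          ; return to PCDOS
        int  21H
        END  start
```

With choice set to 1 the program prints the message at msg1. Adding a fourth case only needs another entry in jmptbl, a new label, and a larger limit in the range check.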
The following program implements a jump table.
    TITLE Jump.asm
            .MODEL Large
            .STACK 200h
            .DATA
    help    db 'This program exits when a function key is pressed.'
            db 10, 13, 'Ctrl A generates underline.', 10, 13
            db 'Ctrl B generates bold.', 10, 13
            db 'Ctrl C generates blinking.', 10, 13
            db 'All other control codes return to normal text.', 10, 13
            db 10, 13, 'Start typing characters.', 10, 13, '$'
    attrib  db 07h              ; screen attribute byte

    ; a table of addresses used to decipher received control codes
    ; each entry is the address of the appropriate routine
    ; (comments give the control code in hex, then octal)
    ctl_tbl label word
            dw ctrl_null        ; 0
            dw ctrla            ; 1
            dw ctrlb            ; 2
            dw ctrlc            ; 3
            dw ctrld            ; 4
            dw ctrle            ; 5
            dw ctrlf            ; 6
            dw ctrlg            ; 7
            dw ctrlh            ; 8  10
            dw ctrli            ; 9  11
            dw ctrlj            ; a  12
            dw ctrlk            ; b  13
            dw ctrll            ; c  14
            dw ctrlm            ; d  15
            dw ctrln            ; e  16
            dw ctrlo            ; f  17
            dw ctrlp            ; 10 20
            dw ctrlq            ; 11 21
            dw ctrlr            ; 12 22
            dw ctrls            ; 13 23
            dw ctrlt            ; 14 24
            dw ctrlu            ; 15 25
            dw ctrlv            ; 16 26
            dw ctrlw            ; 17 27
            dw ctrlx            ; 18 30
            dw ctrly            ; 19 31
            dw ctrlz            ; 1a 32
            dw ctrl_lbkt        ; 1b 33
            dw ctrl_bslash      ; 1c 34
            dw ctrl_rbkt        ; 1d 35
            dw ctrl_carat       ; 1e 36
            dw ctrl_ul          ; 1f 37

            .CODE
    bumpcur proc far            ; move cursor right one character
            mov ah, 3
            xor bh, bh
            int 10h             ; read cursor position into dh, dl
            inc dl              ; next column
            cmp dl, 80          ; end of line?
            jl  short bpcur1    ; columns 0 to 79 are on screen
            xor dl, dl          ; go to start of next line
            inc dh
            cmp dh, 24          ; end of screen?
            jle short bpcur1    ; rows 0 to 24 are on screen
            mov ax, 0601h       ; then scroll up one line
            xor cx, cx
            push dx
            mov dh, 24
            mov dl, 79
            mov bh, [attrib]
            int 10h
            pop dx
            mov dh, 24          ; position on bottom line
    bpcur1: xor bh, bh          ; set cursor position
            mov ah, 2
            int 10h
            ret
    bumpcur endp

    ctrl_code proc far          ; process control codes
            push bx
            cbw                 ; convert AL to AX
            mov bx,ax           ; use bx as an index into
            shl bx,1            ; the ctl_tbl
            jmp ctl_tbl[bx]     ; jump to key routine
    ctrla:  and byte ptr [attrib], 0f9h ; underline
            jmp ctrl_exit
    ctrlb:  or  byte ptr [attrib], 08h  ; bold
            jmp ctrl_exit
    ctrlc:  or  byte ptr [attrib], 80h  ; blink on
            jmp ctrl_exit
    ctrld:                      ; all others normal
    ctrl_null:
    ctrle:
    ctrlf:
    ctrlg:
    ctrlh:
    ctrli:
    ctrlj:
    ctrlk:
    ctrll:
    ctrlm:
    ctrln:
    ctrlo:
    ctrlp:
    ctrlq:
    ctrlr:
    ctrls:
    ctrlt:
    ctrlu:
    ctrlv:
    ctrlw:
    ctrlx:
    ctrly:
    ctrlz:
    ctrl_lbkt:
    ctrl_bslash:
    ctrl_rbkt:
    ctrl_carat:
    ctrl_ul:
            mov byte ptr [attrib], 07h  ; normal attribute
    ctrl_exit:
            pop bx
            ret
    ctrl_code endp

    start:  mov ax, @data
            mov ds, ax
            mov ah, 9h          ; print help message
            mov dx, offset help
            int 21h
    lp1:    mov ah, 06h         ; read character from keyboard
            mov dl, 0ffh
            int 21h
            jz  lp1             ; repeat if character not ready
            cmp al, 00h         ; if function key then exit
            je  exit
            cmp al, 32          ; else if control code
            jae disp1
            call ctrl_code      ; then process control code
            jmp lp1
    disp1:  push bx
            xor bx, bx          ; page zero of video memory
            mov bl, [attrib]    ; get character attribute
            mov cx, 1           ; one character to write
            mov ah, 9           ; write char + attribute
            int 10h             ; use BIOS call
            call bumpcur        ; next cursor position
            pop bx              ; balance the push at disp1
            jmp lp1             ; repeat
    exit:   mov ax, 4c00h
            int 21h
            END start